The CALBC RDF Triple Store: Retrieval over Large Literature Content

Authors

  • Samuel Croset
  • Christoph Grabmüller
  • Chen Li
  • Silvestras Kavaliauskas
  • Dietrich Rebholz-Schuhmann
Abstract

Integrating the scientific literature into a biomedical research infrastructure requires processing the literature, identifying the named entities (NEs) and concepts it contains, and representing the content in a standardised way. The CALBC project partners (PPs) have produced a large-scale annotated biomedical corpus, the Silver Standard Corpus (SSC), covering four semantic groups by harmonising annotations from automatic text mining solutions. The four semantic groups are chemical entities and drugs (CHED), genes and proteins (PRGE), diseases and disorders (DISO) and species (SPE). The content of the SSC has been fully integrated into an RDF Triple Store (4,568,678 triples) and aligned with content from the GeneAtlas (182,840 triples), UniProtKb (12,552,239 triples for human) and the lexical resource LexEBI (BioLexicon). The RDF Triple Store enables querying the scientific literature and bioinformatics resources at the same time for evidence of genetic causes, such as drug targets and disease involvement.
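The idea of querying literature annotations and a bioinformatics resource together can be illustrated with a minimal in-memory sketch. It is not the CALBC implementation: the identifiers (`doc:1`, `gene:G1`, `protein:P1`, `disease:D1`), the predicates (`mentions`, `encodes`) and the helper functions are all hypothetical placeholders, and a real deployment would presumably use an RDF store queried with SPARQL rather than a Python set.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical data: every identifier and predicate here is a placeholder,
# not the actual CALBC, GeneAtlas, or UniProtKb vocabulary.
TRIPLES: Set[Triple] = {
    # Literature side: documents annotated with entities (SSC-like).
    ("doc:1", "mentions", "gene:G1"),
    ("doc:1", "mentions", "disease:D1"),
    ("doc:2", "mentions", "gene:G2"),
    ("doc:2", "mentions", "chem:C1"),
    # Bioinformatics side: gene-to-protein links (UniProtKb-like).
    ("gene:G1", "encodes", "protein:P1"),
    ("gene:G2", "encodes", "protein:P2"),
}


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for triple in TRIPLES:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def proteins_linked_to_disease(disease: str) -> List[Triple]:
    """Join literature and resource triples.

    Returns (document, gene, protein) rows where the document mentions
    both the disease and a gene, and the gene encodes the protein.
    """
    rows = set()
    for doc, _, _ in match(None, "mentions", disease):
        for _, _, entity in match(doc, "mentions", None):
            for _, _, protein in match(entity, "encodes", None):
                rows.add((doc, entity, protein))
    return sorted(rows)


if __name__ == "__main__":
    for row in proteins_linked_to_disease("disease:D1"):
        print(row)
```

The nested loops play the role of a three-pattern join: one pattern over the literature annotations, one over the same document's other mentions, and one over the resource triples. Co-mention in a document is only weak evidence of association, which is why the sketch returns the supporting document alongside each gene and protein.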

Similar resources

Application-Specific Schema Design for Storing Large RDF Datasets

In order to realize the vision of the Semantic Web, a semantic model for encoding content in the World Wide Web, efficient storage and retrieval of large RDF data sets is required. A common technique for storing RDF data (graphs) is to use a single relational database table, a triple store, for the graph. However, we believe a single triple store cannot scale for the needs of large-scale applic...

Supporting Scalable, Persistent Semantic Web Applications

To realize the vision of the Semantic Web, efficient storage and retrieval of large RDF data sets is required. A common technique for persisting RDF data (graphs) is to use a single relational database table, a triple store. But, we believe a single triple store cannot scale for large-scale applications. This paper describes storing and querying persistent RDF graphs in Jena, a Semantic Web pro...

Modular P2P-Based Approach for RDF Data Storage and Retrieval

One of the key elements of the Semantic Web is the Resource Description Framework (RDF). Efficient storage and retrieval of RDF data in large scale settings is still challenging and existing solutions are monolithic and thus not very flexible from a software engineering point of view. In this paper, we propose a modular system, based on the scalable Content-Addressable Network (CAN), which give...

Jena Property Table Implementation

A common approach to providing persistent storage for RDF is to store statements in a three-column table in a relational database system. This is commonly referred to as a triple store. Each table row represents one RDF statement. For RDF graphs with frequent patterns, an alternative storage scheme is a property table. A property table comprises one column containing a statement subject plus on...

Policy-Based Access Control for an RDF Store

Specialized stores for RDF data are essential parts of many Semantic Web applications. Current RDF stores have primarily focused on efficiently storing and querying large volumes of data, and little attention has been given to other features common to many database systems, including how information can be updated and maintained or how access to data can be controlled. The problem is complicated by the fact that...

Journal title:
  • CoRR

Volume: abs/1012.1650

Publication date: 2010